Background: Upper digestive endoscopy with biopsy and histopathological evaluation of the biopsy material is the standard 
method for diagnosing gastric cancer (GC). However, this procedure may not be widely available for screening in the developing 
world, whereas in developed countries endoscopy is frequently used without major clinical gain. There is a high demand 
for a simple and non-invasive test for selecting the individuals at increased risk that should undergo the endoscopic examination. 
Here, we studied the feasibility of a nanomaterial-based breath test for identifying GC among patients with gastric complaints. 

Methods: Alveolar exhaled breath samples from 130 patients with gastric complaints (37 GC/32 ulcers / 61 less severe conditions) 
that underwent endoscopy/biopsy were analyzed using nanomaterial-based sensors. Predictive models were built employing 
discriminant factor analysis (DFA) pattern recognition, and their stability against possible confounding factors (alcohol/tobacco 
consumption; Helicobacter pylori) was tested. Classification success was determined (i) using leave-one-out cross-validation and 
(ii) by randomly blinding 25% of the samples as a validation set. Complementary chemical analysis of the breath samples was 
performed using gas chromatography coupled with mass spectrometry. 

Results: Three DFA models were developed that achieved excellent discrimination between the subpopulations: (i) GC vs benign 
gastric conditions, among all the patients (89% sensitivity; 90% specificity); (ii) early stage GC (I and II) vs late stage (III and IV), 
among GC patients (89% sensitivity; 94% specificity); and (iii) ulcer vs less severe, among benign conditions (84% sensitivity; 87% 
specificity). The models were insensitive against the tested confounding factors. Chemical analysis found that five volatile organic 
compounds (2-propenenitrile, 2-butoxy-ethanol, furfural, 6-methyl-5-hepten-2-one and isoprene) were significantly elevated in 
patients with GC and/or peptic ulcer, as compared with less severe gastric conditions. The concentrations both in the room air and 
in the breath samples were in the single p.p.b.v range, except in the case of isoprene. 

Conclusion: The preliminary results of this pilot study could open a new and promising avenue to diagnose GC and distinguish it 
from other gastric diseases. It should be noted that the applied methods are complementary and the potential marker 
compounds identified by gas-chromatography/mass spectrometry are not necessarily responsible for the differences in the sensor 
responses. Although this pilot study does not allow drawing far-reaching conclusions, the encouraging preliminary results 
presented here have initiated a large multicentre clinical trial to confirm the observed patterns for GC and benign gastric 
conditions. 
Gastric cancer (GC) is one of the most common causes of death 
from cancer worldwide, and most of the cases occur in developing 
countries (Ferlay et al, 2010). Unspecific clinical symptoms and the 
lack of defined risk factors often delay the diagnosis of the disease, 
leading to extremely poor prognosis and high rates of recurrence 
(Yasui et al, 2005; Pisters et al, 2008). Earlier diagnosis 
substantially improves the prognosis: 95% of patients with cancer 
that is confined to the inner lining of the stomach wall will survive 
longer than 5 years (Crew and Neugut, 2006). 

The standard method for diagnosing GC is upper digestive 
endoscopy combined with biopsy and histopathological evaluation 
of the biopsy samples. This method has a high diagnostic accuracy 
of 95 to 99% (Dooley et al, 1984), but is plagued with some 
prominent drawbacks, which limit its suitability for population- 
based screening. First, compliance is reduced by the invasive 
and relentless nature of this procedure (Chen et al, 2009); second, 
the method is relatively costly, and requires highly skilled medical 
staff. 

The incidence for GC varies widely in different regions of 
the world, reaching peak values in the countries of East Asia, 
Eastern Europe and South America (Pisters et al, 2008). In China, 
for instance, the age-adjusted incidence in men is 41.3/100 000 per 
year, whereas in the United States the corresponding incidence is 
almost one order of magnitude lower (5.7/100 000 per year) 
(Ferlay et al, 2010). The availability of upper endoscopy may be 
restricted in high-incidence areas, especially in the developing 
world, where population-wide screening would be necessary. 

Japan, the first country that has started a population-based 
GC-screening programme, still recommends photofluorography 
both for organised and opportunistic screening (Hamashima et al, 
2008), even though it involves exposure to X-ray irradiation. 
Indirect screening for atrophy, using blood pepsinogen tests, is 
recommended by the Asian-Pacific guidelines for high-risk 
populations (Fock et al, 2008), but so far the method has not 
been implemented for any organised screening programme. In 
areas of low GC incidence, on the other hand, endoscopy is 
frequently overused without major clinical gain, burdening the 
health budget. Hence, there is globally a high demand for a simple 
and non-invasive GC-screening test, to identify individuals at 
increased risk that should undergo an endoscopic examination, 
while avoiding unnecessary endoscopic investigations and costs in 
populations that are not at risk. 

Biomarkers that are derived from exhaled breath may provide a 
safe and elegant solution for mass GC screening. Over the past two 
decades, the analysis of volatile organic compounds (VOCs) has 
witnessed an enormous boost, as they have been described as a 
possible method to diagnose rapidly a variety of diseases, for 
example, cancers of the lung, breast, colon, prostate, liver, head- 
and-neck, as well as kidney disease, multiple sclerosis and 
Parkinson's disease (Gordon et al, 1985; O'Neill et al, 1988; 
Mendis et al, 1994; Phillips et al, 1994, 1999; Miekisch et al, 2004; 
Amann et al, 2007; Barash et al, 2009; Peng et al, 2009, 2010; 
Shuster et al, 2010; Song et al, 2010; Hakim et al, 2011; Ionescu 
et al, 2011; Tisch et al, 2011; Broza et al, 2013). 

Haick and co-workers have developed highly sensitive, cross- 
reactive, nanomaterial-based gas sensors that could classify 
different types of cancer in the exhaled breath, using statistical 
pattern recognition methods, irrespective of the patients' gender, 
lifestyle, smoking habits and other confounding factors. The 
discriminative power of the sensor arrays was demonstrated in 
pilot studies, using limited patient cohorts (Peng et al, 2010; Tisch 
and Haick, 2010a-c; Hakim et al, 2011; Broza et al, 2013). Here, we 
demonstrate that arrays of nanomaterial-based sensors can 
distinguish the benign and malignant ulcers from other less severe 
gastric lesions, using breath samples of patients with gastric 
complaints. We further demonstrate that the results were not 
affected by important confounding factors such as alcohol/tobacco 
consumption and Helicobacter pylori (H. pylori) infection. 



PATIENTS AND METHODS 



Patients. Breath samples were collected after written informed 
consent from 160 volunteers with gastric complaints, aged 27-73 
years, at the First Affiliated Hospital of Anhui Medical University 
(Hefei, China) (see Table 1). 

All volunteers underwent upper digestive endoscopy after 
recruitment according to the hospital's routine clinical protocol. 
Biopsy samples were taken for histopathology, if lesions (including 
ulceration of the stomach lining) were visually observed. Otherwise, 
the endoscopic abnormalities were assessed according to the 
Sydney classification system of endoscopic division (Tytgat, 1991). 
The following exclusion criteria were applied before sample 
collection: patients who have undergone gastric resection in the 
past; patients who were found to suffer from endoscopically 
detectable precancerous conditions (e.g. mucosal atrophy); 
and patients who took medication affecting gastric acid secretion 
(e.g. proton pump inhibitors) and/or antibiotics during an interval 
of 1 month before the breath test. The reason for the latter 
exclusion criterion for this pilot study was that previous 
medication could strongly affect the composition of the 
exhaled breath. 

Table 1. Clinical characteristics of all tested patients. Tobacco consumption, alcohol consumption and H. pylori infection are listed as gastric cancer risk factors. 

| Group | Subgroup | Number of patients | Age (years) | Gender (M:F) | Tobacco consumption | Alcohol consumption | H. pylori infection | Diagnosis |
|---|---|---|---|---|---|---|---|---|
| Gastric cancer | Total | 37 | 58.2 ± 9.2 | 28:9 | 41% | 43% | 51% | Endoscopy with biopsy |
| Gastric cancer | Early stage (stages I and II) | 17 | 57.6 ± 11.7 | 13:4 | 35% | 35% | 65% | Endoscopy with biopsy |
| Gastric cancer | Late stage (stages III and IV) | 18 | 59.1 ± 6.9 | 13:5 | 39% | 44% | 39% | Endoscopy with biopsy |
| Gastric cancer | Unknown stage | 2 | | | | | | Endoscopy with biopsy |
| Non-malignant gastric conditions | Gastric ulcer | 32 | 50.8 ± 14.2 | 23:9 | 44% | 47% | 59% | Endoscopy with biopsy |
| Non-malignant gastric conditions | Less severe gastric conditions: total | 61 | 51.4 ± 8.8 | 30:31 | 21% | 21% | 21% | Endoscopy only |
| Non-malignant gastric conditions | Less severe gastric conditions: endoscopic abnormalities without ulceration | 29 | 50.6 ± 9.3 | 17:12 | 24% | 24% | 31% | Endoscopy only |
| Non-malignant gastric conditions | Less severe gastric conditions: no endoscopic abnormalities | 32 | 52.2 ± 8.3 | 13:19 | 19% | 19% | 13% | Endoscopy only |

After excluding the breath samples of 30 patients, which were 
damaged during storage and/or transport, the breath samples of 
130 patients were analyzed for this study: 37 GC patients (early 
stages I and II: 17; late stages III and IV: 18, without staging 
information: 2), 32 patients with benign gastric ulcers and 
61 patients with less severe gastric conditions (see Table 1). The 
less severe stomach conditions included cases with no endoscopic 
abnormalities (32) and with endoscopic abnormalities without 
ulceration (29) (see Table 1). The latter were classified by the 
treating physicians, according to the Sydney classification system 
of endoscopic division (Tytgat, 1991), as erythematous/exudative 
gastritis, flat erosive gastritis, raised erosive gastritis, hemorrhagic 
gastritis, enterogastric reflux gastritis or rugal hyperplastic gastritis. 
However, for this study we did not further subdivide the group of 
'less severe gastric conditions', because the detection accuracy for 
premalignant lesions purely on endoscopic appearance at white- 
light endoscopy is highly controversial (Atkins and Benedict, 1956; 
Carpenter and Talley, 1995). 

Ethical approval was obtained from the ethics committee of 
Anhui Medical University (Hefei, China), and the clinical trial was 
registered. The treatment decisions were based solely on the 
conventional diagnosis described above. Neither the patients nor 
their treating physicians were informed of the results of the breath 
tests. 

Collection of the breath samples. Exhaled alveolar breath was 
collected in a controlled manner, as described in Peng et al (2009, 
2010) and Hakim et al (2011). The volunteers were invited 
on specific collection days in groups of 10 to 20. None of 
the volunteers consumed food, tobacco or alcohol during an 
(overnight) 12-h interval before the breath collection. All 
volunteers were asked to rest for 1 h before the breath sampling 
and did not perform heavy physical exercise 24 h before giving the 
breath sample. All breath samples were collected in the same 
clinical environment and in duplicates (for the dual analysis, see 
section below) from each volunteer, and were stored in two-bed 
ORBO™ 420 Tenax TA sorption tubes for gas and vapor sampling 
(Sigma-Aldrich, St Louis, MO, USA). Unfiltered hospital air was 
sampled in the morning of each collection day. A detailed 
description of the breath collection, sample preparation and 
storage can be found in section S1.1 of the Supplementary Online 
Material (SOM). 

Characterisation of the breath samples. The breath samples were 
characterised in a dual approach, using two totally independent, 
complementary characterisation methods: (i) chemical analysis of 
the breath samples with the aim to identify the VOCs that show 
statistically different concentrations in the compared subpopula- 
tions, using gas-chromatography/mass spectrometry (GC-MS). 
Compound identification and quantification were achieved 
through measurement of external standards, as recommended in 
Bajtarevic et al (2009), Ligor et al (2009), Sponring et al (2009) and 
Filipiak et al (2010). The breath sample analysis with GC-MS is 
described in detail in section S1.2 of the SOM. (ii) Characterisation 
of the breath samples with an array of 14 nanomaterial-based 
sensors, combined with a statistical pattern recognition algorithm 
(see section 'Statistical analysis'), with the aim of identifying 
specific patterns (the so-called breath prints) for GC and non- 
malignant gastric conditions, and the subcategories described 
above. The sensors included layers of gold nanoparticles with 11 
different organic ligands and layers of single-walled carbon 
nanotubes capped with four different organic overlayers (see 
SOM and Tisch and Haick (2010a,b,c)). The breath sample analysis 
with the nanomaterial-based sensor array is described in detail in 
section S1.3 of the SOM. A description of the nanomaterial-based 
sensor array is given in section S1.4 of the SOM. 

A small number of samples (from 30 patients) were damaged or 
destroyed because of breakage during the transport and storage. 

Study design. The primary aim of this cross-sectional compara- 
tive study was to distinguish GC patients from patients with benign 
gastric conditions who may present similar clinical symptoms. 
The secondary aim was to distinguish subpopulations in the 
malignant and non-malignant study groups. Conventional 
diagnosis served as reference standard. 

This single-centre pilot study with a limited patient cohort of 
160 (after application of the exclusion criteria, see section 
'Patients') was designed as a feasibility test of a nanomaterial- 
based breath test for GC, with the aim of delivering a proof of 
concept that would justify a large-scale, multicentre trial with a 
more realistic ratio of malignant to non-malignant gastric 
conditions. 

The breath samples of 30 patients were damaged during storage 
and/or transport and could not be analyzed. Hence, the samples 
of 130 patients were analyzed for this study: 37 GC patients 
(early stages I and II: 17; late stages III and IV: 18; without staging 
information: 2), 32 patients with benign gastric ulcers and 61 
patients with less severe gastric conditions (see Table 1). 

Statistical analysis 

GC-MS. The VOCs that showed significant differences (cutoff 
P-value: 0.05) between the study groups were determined from the 
GC-MS results by means of the non-parametric Wilcoxon/ 
Kruskal-Wallis test for populations whose data cannot be assumed 
to be normally distributed (Wilcoxon, 1945), using JMP, version 
9.0.0 (SAS Institute Inc., Cary, NC, USA; 1989-2005). 
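
As an illustration of the kind of two-group comparison described above, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test in plain Python. It is not the JMP procedure used in the study: it assumes a normal approximation with tie correction and no continuity correction, and the concentration values are hypothetical numbers chosen only to exercise the code.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    w = sum(ranks[:n_a])                 # rank sum of the first group
    mean_w = n_a * (n + 1) / 2           # expected rank sum under H0
    # Tie correction term: sum of (t^3 - t) over each set of tied values
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, z, p

# Hypothetical VOC concentrations (p.p.b.v) for two study groups
less_severe = [1.0, 2.0, 3.0, 4.0, 5.0]
gastric_cancer = [6.0, 7.0, 8.0, 9.0, 10.0]
w, z, p = rank_sum_test(less_severe, gastric_cancer)
print(f"W = {w}, z = {z:.2f}, significant at 0.05: {p < 0.05}")
```

For small groups an exact test would normally be preferred over the normal approximation; the sketch only shows how the rank-based comparison against the 0.05 cutoff works in principle.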

Sensor array. Each of the 14 sensors in the array responded to all (or to a 
certain subset) of the VOCs found in the exhaled breath samples. 
Specific patterns and predictive models for the studied gastric 
conditions were derived from the sensor array output, using 
discriminant factor analysis (DFA) (Ionescu et al, 2002). 
Discriminant factor analysis is a linear, supervised pattern 
recognition method that effectively reduces the multidimensional 
experimental data, in which the classes to be discriminated are 
defined before the analysis is performed. Discriminant factor 
analysis was also used as a heuristic to select the sensors with the 
most relevant organic functionality out of the repertoire of 14, by 
filtering out non-contributing sensors. The reason for selecting a 
certain set of sensing features for a particular problem was directly 
derived from their ability to discriminate between the various 
classification groups. The input variables for DFA were the four 
features extracted from each of the 14 sensors' time-dependent 
resistance responses, that is, a total of 56 sensing features 
(see sections S1.3, S1.4 and Supplementary Table S1 in the 
SOM). The four sensing features were related to the normalised 
resistance change at the beginning of the exposure, at the middle of 
the exposure and at the end of the exposure (with respect to the 
value of the sensor's resistance in vacuum before the exposure), and to 
the area beneath the time-dependent resistance response during 
the last third of the exposure period, as described in section S1.3 in 
the SOM. Discriminant factor analysis determines the linear 
combinations of the input variables such that the variance within 
each class is minimised and the variance between classes is 
maximised. The DFA output variables (i.e. canonical variables) are 
obtained in mutually orthogonal dimensions; the first canonical 
variable is the most powerful discriminating dimension. The 
classification success was estimated through leave-one-out 
cross-validation in terms of the number of true-positive (TP), 
true-negative (TN), false-positive (FP) and false-negative (FN) predictions. 
Given n measurements, the model was computed using n − 1 
training vectors. The validation vector that was left out during the 
training phase was then projected onto the model, producing a 
classification result. All possibilities of leave-one-sample-out were 
considered, and the classification accuracy was estimated as the 
averaged performance over the n tests. Pattern recognition 
and data classification were conducted using MATLAB (The 
MathWorks, Natick, MA, USA). Following the leave-one-out 
cross-validation, 25% of the samples were randomly blinded for an 
additional validation test: the DFA model was calculated again with 
the remaining 75% of the samples, and the blind set was classified as 
described above. 
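
The leave-one-out procedure described above can be sketched as follows. This is a simplified illustration, not the MATLAB implementation used in the study: it assumes only two classes and two hypothetical sensing features per sample (instead of the 56 features of the sensor array), and it uses a two-class Fisher linear discriminant as a stand-in for DFA. All function names and data values are invented for the example.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def mean_vector(rows):
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

def fit_fisher_2d(samples, labels):
    """Fit a two-class Fisher discriminant on samples with two features."""
    class0 = [x for x, y in zip(samples, labels) if y == 0]
    class1 = [x for x, y in zip(samples, labels) if y == 1]
    m0, m1 = mean_vector(class0), mean_vector(class1)
    # Pooled within-class scatter matrix [[a, b], [b, d]]
    a = b = d = 0.0
    for rows, m in ((class0, m0), (class1, m1)):
        for x in rows:
            dx, dy = x[0] - m[0], x[1] - m[1]
            a += dx * dx
            b += dx * dy
            d += dy * dy
    det = a * d - b * b
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^-1 (m1 - m0), using the closed-form inverse of a 2x2 matrix
    w = [(d * diff[0] - b * diff[1]) / det,
         (-b * diff[0] + a * diff[1]) / det]
    # Decision threshold: midpoint between the projected class means
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold

def predict(w, threshold, x):
    return 1 if dot(w, x) > threshold else 0

def leave_one_out(samples, labels):
    """Train on n - 1 samples, classify the left-out one, repeat n times."""
    tp = tn = fp = fn = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        w, threshold = fit_fisher_2d(train_x, train_y)
        predicted = predict(w, threshold, samples[i])
        if predicted == 1 and labels[i] == 1:
            tp += 1
        elif predicted == 0 and labels[i] == 0:
            tn += 1
        elif predicted == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

# Hypothetical feature vectors: label 0 = benign, label 1 = malignant
samples = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2), (1.8, 1.6),
           (4.0, 5.0), (4.5, 4.6), (5.0, 5.5), (4.2, 5.1), (4.8, 4.4)]
labels = [0] * 5 + [1] * 5
tp, tn, fp, fn = leave_one_out(samples, labels)
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("accuracy:", (tp + tn) / len(samples))
```

Because each left-out sample is never seen during training, the resulting counts give a less optimistic estimate of classification success than re-classifying the training data would.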



RESULTS 



Chemical analysis of the breath samples. The GC-MS analysis 
identified hundreds of different VOCs per individual breath 
sample, and 214 VOCs were present in >85% of the breath 
samples. The GC-MS chromatograms of pristine Tenax material 
from unused ORBO™ 420 Tenax TA sorption tubes showed 
several prominent peaks corresponding to five VOCs that are 
probably contaminants of the Tenax sorbent material of the 
collection tubes. The VOCs were tentatively identified by spectral 
library match (Compounds library of the National Institute of 
Standards and Technology, Gaithersburg, MD, USA, see section 
S1.2 in the SOM) as methylene chloride, acetaldehyde, L-cysteine 
sulphonic acid, malonic acid and naphthalene (Amal et al, 2012). 
These substances were disregarded in the subsequent comparative 
analysis. Propanol, ethanol and methyl-isobutyl-ketone (also 
tentatively identified by spectral library match) were found in 
high abundance in the room air samples at the location of the 
breath tests that were taken on each collection day. These are 
typical hospital contaminants (Amann et al, 2010). However, they 
were found in much lower (almost negligible) abundance in < 85% 
of the breath samples, because of the effective lung washout prior 
to the breath collection that was routinely performed as an integral 
part of the one-step breath collection procedure (see section S1.1 in 
the SOM). Hence, 209 compounds were further analysed. Shapiro-Wilk 
tests showed that the null hypothesis for normal distribution 
of the GC-MS data was not fulfilled for these 209 VOCs. Therefore, 
non-parametric Wilcoxon/Kruskal-Wallis tests with a cutoff value 
of P = 0.05 were used for the comparative analysis of the GC-MS 
data. We compared all possible pairs of the following groups in two 
data sets: GC; ulcer; less severe conditions; and non-malignant 
gastric conditions = ulcer + less severe conditions (see Table 1). 
Initially 35 VOCs were found to be of statistical significance for the 
separation of the groups. In total, 27 VOCs were excluded after 
comparison with the room air and Tenax TA control samples, 
because they appeared in similar abundance or showed strong day- 
to-day fluctuation. The remaining 11 VOCs were tentatively 
identified through spectral library match as tetra-chlorobutyl 
acetate, 2-propenenitrile, 1-methoxy-2-propanol, 2-butoxy-ethanol, 
furfural, 2-pentyl acetate, 6-methyl-5-hepten-2-one, isoprene, 
4,5-dimethyl-nonane, 2-phenoxy-ethanol and 1-pentene. After 
measurement of calibration mixtures of high-purity external 
standards (Bajtarevic et al, 2009; Ligor et al, 2009; Sponring 
et al, 2009; Filipiak et al, 2010), we have excluded five VOCs: 
1-methoxy-2-propanol, 2-phenoxy-ethanol and 1-pentene were 
excluded because of retention time mismatch between breath 
samples and calibration standards; tetra-chlorobutyl acetate and 2- 
pentyl acetate were excluded because the measured concentrations 
in the breath samples were below the corresponding limit of 
quantification (LoQ). Furthermore, 4,5-dimethyl-nonane was 
excluded, because we were not able to obtain a high-purity 
calibration standard, and, hence, could not perform identity 
confirmation and quantification. 

The remaining five VOCs from the families of nitriles, alcohol 
ethers, aldehydes, ketones and alkenes showed statistically 
significant differences in the concentration levels of the compared 
groups (see Table 2). Three compounds (2-propenenitrile, furfural 
and 6-methyl-5-hepten-2-one) were on average elevated in GC, as 
compared with the less severe gastric conditions without ulceration 
(P< 0.0001, see Table 2). Four VOCs (2-butoxy-ethanol, furfural, 
6-methyl-5-hepten-2-one and isoprene) distinguished between 
patients suffering from non-malignant gastric ulcer and patients 
with less severe gastric conditions, showing significantly higher 
concentration levels in the former (see Table 2). The VOCs, which 
were significantly elevated in patients with GC and/or peptic ulcer, 
as compared with less severe gastric conditions, were found in the 
room air in significantly lower concentrations (P<0.05). However, 
it should be noted that these VOCs were found both in the 
room air and in the breath samples in the single p.p.b.v range, 
except in the case of isoprene (see Table 2). Indeed, the average 
concentrations of 2-propenenitrile, 2-butoxy-ethanol, furfural and 
6-methyl-5-hepten-2-one in the breath samples of patients with 
less severe gastric conditions were not different from the room air 
concentrations. 

Identification and distinction of malignant and non-malignant 
gastric conditions using the nanomaterial-based sensors. The 
feasibility of the nanomaterial-based sensors to identify GC among 
patients with gastric complaints was demonstrated by building a 
DFA model based on all 130 characterised breath samples that 
discriminated well between the 37 GC patients and the 93 patients 
with non-malignant gastric conditions (see Table 1). Figure 1A 
shows the DFA plot obtained from the responses of seven sensors 
with different organic functionalities (see Supplementary Table S1). 
The malignant and non-malignant gastric conditions formed two 
well-defined clusters in two-dimensional DFA space with no 
overlap and with few misclassified samples. The clusters were 
completely separated along the first canonical variable (CV1). High 
classification success of the first DFA model was verified through 
leave-one-out cross-validation. Table 3 lists the excellent cross- 
validation results for accuracy, sensitivity, specificity, positive 
predictive value (PPV) and negative predictive value (NPV). To 
further test the stability of this DFA model, we randomly blinded 
32 of the 130 samples (25%), calculated the DFA model again with 
the remaining training set of 98 samples and projected the blinded 
test set onto the model. Subsequent disclosure of the sample 
identity (6 GC and 26 non-malignant conditions) yielded 5 TP 
(GC classified as GC), 25 TN (benign conditions classified as 
benign conditions), 1 FP (benign condition classified as GC) and 
1 FN (GC classified as benign condition). The accuracy, sensitivity 
and specificity that were achieved in this additional blind 
validation test were 94%, 83% and 96%, respectively. Random 
blinding of different subsets of the data (totalling 25% of the 
samples irrespective of the sample identity) yielded similar results, 
demonstrating the stability of the proposed DFA model. 
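
The figures quoted for the blind validation test follow directly from the four counts given above, taking the five GC samples classified as GC as true positives. A short sketch of the arithmetic (the function name is chosen here for illustration only):

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard diagnostic measures derived from a 2x2 confusion table."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of GC cases detected
        "specificity": tn / (tn + fp),   # fraction of benign cases cleared
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts reported for the blinded test set of 32 samples
blind = diagnostic_metrics(tp=5, tn=25, fp=1, fn=1)
for name, value in blind.items():
    print(f"{name}: {value:.0%}")
# accuracy: 94%, sensitivity: 83%, specificity: 96%
```

With only six GC samples in the blinded set, a single additional misclassification would shift the sensitivity by about 17 percentage points, which is worth keeping in mind when reading these values.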

Among the malignant gastric conditions, a second DFA model 
that was based on 35 of the 37 GC patients could completely 
separate the 17 early-stage GC cases from the 18 late-stage GC 
cases along CV1 (see Figure 1B). Two of the GC patients were 
excluded from this analysis, because no staging information was 
available for them (see Table 1 and Supplementary Table S1). 

Figure 1. Discriminant factor analysis separating between patients with: (A) GC and non-malignant gastric conditions; (B) early- and late-stage 
GC; (C) gastric ulcer and less severe gastric conditions; (D) gastric cancer, gastric ulcer and less severe gastric conditions. The less severe gastric 
conditions include the endoscopic abnormalities described in the Sydney classification for gastritis, as well as no obvious gastric mucosal lesions. 
Every point represents one patient. 



The following classification success of the second DFA model was 
achieved: 91% accuracy, 89% sensitivity, 94% specificity, 94% PPV 
and 89% NPV (see Table 3). Random blinding of different subsets 
of the data (each blinded test set included 8 samples of the 35 
staged GC samples, totalling 25% of the GC samples irrespective of 
the sample identity) yielded on average 90% accuracy, 88% 
sensitivity and 93% specificity, with little variation between the 
different data sets. 

A third DFA model was built based on the samples of the 93 
patients having non-malignant gastric conditions for distinguishing 
between benign gastric ulcer and gastric conditions without 
ulceration (32 and 61 samples, respectively, see Table 1 and 
Supplementary Table S1). Figure 1C shows that clusters were 
formed for the two subpopulations along CV1, but the clusters had 
some overlap and were more spread out than for the first two DFA 
models. Nevertheless, leave-one-out cross-validation yielded rea- 
sonable values for accuracy (86%), sensitivity (84%), specificity 
(87%), PPV (77%) and NPV (91%) (see Table 3). Randomly 
blinding a subset of 23 samples yielded 83% accuracy, 83% 
sensitivity and 83% specificity. However, repeating the blind test 
with different randomly chosen blinded test sets of 23 samples 
showed some variability of the results (classification accuracies 
varied between 65 and 83%), indicating that the third DFA model 
is less stable than the first two models. 
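
The repeated blind test can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the analysis performed in the study: the data are synthetic one-dimensional scores (standing in for a first canonical variable), the classifier is a simple nearest-centroid rule instead of DFA, and the group sizes of 61 and 32 are borrowed from the non-malignant cohort only to make the blinded set come out at 23 samples.

```python
import random

def nearest_centroid_accuracy(train, test):
    """Fit a 1-D nearest-centroid rule on (value, label) pairs and score it."""
    class0 = [v for v, y in train if y == 0]
    class1 = [v for v, y in train if y == 1]
    m0 = sum(class0) / len(class0)
    m1 = sum(class1) / len(class1)
    # A prediction is correct when "closer to m1" agrees with "label is 1"
    correct = sum((abs(v - m1) < abs(v - m0)) == (y == 1) for v, y in test)
    return correct / len(test)

def repeated_blind_test(data, blind_fraction=0.25, repeats=20, seed=0):
    """Blind a random fraction of the samples several times over."""
    rng = random.Random(seed)
    n_blind = round(len(data) * blind_fraction)
    accuracies = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        blind, train = shuffled[:n_blind], shuffled[n_blind:]
        accuracies.append(nearest_centroid_accuracy(train, blind))
    return accuracies

# Synthetic, partially overlapping scores for two groups (hypothetical)
gen = random.Random(1)
data = ([(gen.gauss(0.0, 1.0), 0) for _ in range(61)] +
        [(gen.gauss(2.0, 1.0), 1) for _ in range(32)])

accuracies = repeated_blind_test(data)
print(f"blinded samples per run: {round(len(data) * 0.25)}")
print(f"accuracy range: {min(accuracies):.2f} to {max(accuracies):.2f}")
```

The spread between the lowest and highest accuracy across runs is the quantity of interest: a wide range, as reported for the third model, indicates that the result depends noticeably on which samples happen to be blinded.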

Figure 1D shows that a DFA model based on all 130 samples 
could distinguish between GC, gastric ulcer and less severe gastric 
conditions in one step with 77% classification accuracy. The 
separation between the three clusters requires the calculation of the 
first and the second canonical variable (CV1 and CV2). Randomly 
blinding different subsets containing a total of 32 samples yielded 
stable results for the classification accuracy with little variability 
(on average 75%). 

In addition, we tested a DFA model based on the 61 samples 
from patients with less severe gastric conditions for distinguishing 
between patients with endoscopic abnormalities without ulceration 
and patients with no visible endoscopic abnormalities (see 
Supplementary Table S1). Table 3 shows that a high classification 
success could be achieved also for this case. 

Finally, we have explored the possible effect of the most 
important confounding factors on the sensing results. In this study, 
we have paid special attention to the possible effects of tobacco 
and alcohol consumption, as well as the presence or absence of 
H. pylori infection. Tobacco consumption among the participants 
of this study varied between 19 and 44%, depending on the 
subpopulation, and alcohol consumption varied between 19 and 
47% for the different subpopulations (see Table 1). The effect of 
tobacco consumption on the composition of the exhaled breath 
has been studied by mass spectrometry methods, and a variety of 
breath VOCs has been associated with tobacco consumption 
(see for example Amann et al (2010), Fuchs et al (2010), Kischkel 
et al (2010), and references therein). We have therefore carefully 
verified that none of the DFA models that were developed for this 
study was sensitive to either tobacco or alcohol consumption of the 
participants. For this purpose, we applied each DFA model 
separately to the two subpopulations for which it was developed, 
and defined consumers and non-consumers of alcohol or tobacco 
as the two classes to be separated. The DFA clusters showed 
complete overlap for all models, and the classification was correct 
in only 38-54% of cases (i.e. arbitrary). The percentage of 
H. pylori-infected participants showed a stronger variation 
(between 13 and 65% per subpopulation, see Table 1). We have 
verified that all the DFA models that were used in this study were 
insensitive also to H. pylori infection, with complete cluster overlap 
and arbitrary classification. However, we were able to develop a 
new DFA model for separating between infected and infection-free 
participants within the group of cancer patients. The study of a 
VOC-based breath test for H. pylori infection is currently 
underway and will be published elsewhere. 



DISCUSSION 



Chemical composition of the breath samples. In the following 
section, we will attempt to explain the possible biochemical origin 
of some of these compounds. However, the origin of other breath 
VOCs cannot yet be easily understood. 

2-Propenenitrile (acrylonitrile) can be found as environmental 
pollutant in cigarettes and in car exhaust, and was classified as a 
Class 2B carcinogen (i.e. possibly carcinogenic) by The International 
Agency for Research on Cancer (IARC) (1999). As such this 
compound could reach the blood after inhalation and could be 
accumulated in the body, yielding the observed higher relative 
amounts of the compound in the GC samples, and, thus, indicating 
a much increased GC risk in subjects who were exposed to the 
substance. Therefore, it is important to consider the effect of 
inhaling exogenous compounds on the blood, as any change in the 
composition of blood can affect the body's metabolism and, hence, 
the breath VOC profile (Hakim et al, 2012). Exogenous 
compounds that increase the risk for certain types of cancer could 
be considered as exogenous cancer markers. Interestingly, the 
opposite trend has recently been reported for lung cancer patients: 
2-propenenitrile was found at decreased levels in the breath of 
smokers with lung cancer, as compared with healthy smokers 
(Kischkel et al, 2010). 

Of the four VOCs observed at significantly increased levels in 
the breath of ulcer patients (2-butoxy-ethanol, furfural, 
6-methyl-5-hepten-2-one and isoprene), only isoprene could be explained in 
terms of endogenous physical pathways. Isoprene is formed along 
the mevalonic pathway as part of the cholesterol biosynthesis and 
is always present in high and varying concentration in human 
exhaled breath (Miekisch et al, 2004). Also, isoprene concentra- 
tions show a strong dependency on physical activity, CO (cardiac 
output) and minute ventilation, and may be re-distributed between 
peripheral and central compartments (King et al, 2010). In this 
study, all participants were asked to rest for 1 h before the breath 
sampling and did not perform heavy physical exercise 24 h before 
providing the breath sample, to minimise the effect of physical 
exercise on the blood and, hence, on the exhaled breath. It was 
recently shown that H. pylori uses the host cholesterol in defence 
against antibiotics (McGee et al, 2011), which would result in 
elevated cholesterol biosynthesis and could explain the observed 
higher levels of exhaled isoprene in gastric ulcer patients. In this 
context, it should be mentioned that decreased isoprene levels were 
also observed in the breath of lung cancer patients (Wehinger et al, 
2007; Bajtarevic et al, 2009). 

An additional comparison with the (hospital) room air levels at 
the collection site showed that 2-propenenitrile, 2-butoxy-ethanol, 
furfural and 6-methyl-5-hepten-2-one were present at similar 
levels in the room air as in the breath of the patients with less 
severe gastric conditions. It is therefore possible that the results 
were confounded through previous inhalation or uptake of the four 
VOCs from the hospital environment, storage in the body and 
subsequent gradual expiration. In this case, the concentration in 
the exhaled breath might be correlated with the period of previous 
exposure, rather than with the disease state. 2-Butoxyethanol is 
most likely exogenous, as it occurs in paints and in many cleaning 
products for industrial and home use, and could be taken up by the 
body through inhalation. Furfural occurs in many foods and 
flavourings, but was reported to be toxic, with a median lethal 
dose of 300-500 mg kg⁻¹ in mice after oral intake (Hoydonckx 
et al, 2007). 6-Methyl-5-hepten-2-one is used as artificial 
flavouring, and could be taken up with food. The increased levels 
of these compounds in ulcer patients could indicate that exposure 
increases the risk for this disease, and, hence, they could be 
candidates as exogenous markers of ulcer. However, a larger study 
is necessary to verify these observations. 

The breath prints of malignant and non-malignant gastric 
conditions that were derived from the nanomaterial-based 
sensors. The study design simulated a possible future breath test, 
based on a single breath sample, for the screening and differential 
diagnosis of gastric conditions that could be used to recommend 
upper digestive endoscopy, if indicated, or determine therapeutic 
intervention for less severe gastric conditions (see Figure 1, top 
panel). Breath samples would be taken from a wide population 
with gastric complaints and analyzed using the sensor array. 
The first part of the test would be to check for malignancy, using 
the DFA model that can distinguish between malignant and non- 
malignant gastric conditions. In the second part of the test, the 
malignant and non-malignant populations would be further 
distinguished: (i) the GC stage would be determined in the GC- 
positive subjects, using the DFA model that can distinguish 
between early- and late-stage GC; (ii) the GC-negative subjects 
would be tested for non-malignant ulcer, using the DFA model 
that can distinguish between benign gastric conditions with and 
without ulceration. In addition, we could distinguish in this study 
between patients with endoscopic abnormalities without ulceration 
and patients with no visible endoscopic abnormalities. This 
encouraging preliminary result could eventually lead to a breath 
test for gastritis, which would be of high clinical interest. However, 
in the absence of histology data for these two subpopulations, we 
cannot be certain how well the endoscopic observations correlated 
with clinically significant histological differences. An extended 
multicentre study that includes biopsies for all patients is underway 
and will be published elsewhere. 
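
The two-part decision flow described above can be sketched in a few lines of code. This is only an illustration of the order in which the three binary models would be applied; it is not the software used in the study. The function names and the toy threshold models are hypothetical stand-ins for the fitted DFA models, and the feature values are made up.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Model = Callable[[Features], bool]  # returns True for the "positive" class


def two_step_breath_test(features: Features,
                         is_malignant: Model,
                         is_late_stage: Model,
                         is_ulcer: Model) -> str:
    """Apply the three binary models in the order described in the text."""
    # Part 1: malignant vs non-malignant gastric condition.
    if is_malignant(features):
        # Part 2 (i): staging among GC-positive subjects.
        if is_late_stage(features):
            return "GC, late stage (III-IV)"
        return "GC, early stage (I-II)"
    # Part 2 (ii): ulcer vs less severe condition among GC-negative subjects.
    if is_ulcer(features):
        return "benign, ulcer"
    return "benign, less severe condition"


# Hypothetical stand-ins for the fitted models (simple thresholds on made-up features).
def toy_malignancy_model(f: Features) -> bool:
    return f[0] > 0.5


def toy_stage_model(f: Features) -> bool:
    return f[1] > 0.5


def toy_ulcer_model(f: Features) -> bool:
    return f[2] > 0.5


print(two_step_breath_test([0.2, 0.0, 0.9],
                           toy_malignancy_model, toy_stage_model, toy_ulcer_model))
```

Because the second-stage models are applied only to the subpopulation selected by the first stage, an error in the malignancy step propagates to the final label; the sketch makes this dependency explicit.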

It is of special relevance that all DFA models were insensitive to 
the typical breath VOC patterns that are generated through 
tobacco/alcohol consumption and H. pylori infection, as these 
could be important confounding factors among patients with 
gastric complaints. 

The results of the GC breath test correlated very well with the 
results of upper digestive endoscopy and biopsy (see Figure 1A and 
Table 3). Furthermore, the excellent classification success of early- 
and late-stage GC (see Figure 1B and Table 3) might be of high 
clinical interest for the subsequent targeted endoscopic examina- 
tion of early-stage GC, supporting swift, lifesaving treatment 
decisions. Furthermore, the possibility to discriminate between 
benign ulcer and less significant lesions of the stomach may 
facilitate an appropriate selection of ulcer patients for endoscopy 
(see Figure 1C and Table 3). 

The one-step distinction between different lesions (GC, ulcer 
disease, less significant lesions) is of principal relevance as it 
potentially allows simultaneous confirmation of one disease while 
excluding another. This may have important clinical consequences. 

A breath test for distinguishing GC from less severe gastric 
conditions without ulceration (including gastritis) could be 
combined with conventional endoscopy with biopsy to increase 
the diagnostic yield for gastritis-like carcinomas, corresponding to 
type IIb (flat) GC in the Japanese classification. These subtle 
mucosal changes are visible only as slight surface irregularities and 
may be hard to distinguish from unspecific or inflammatory 
lesions by conventional endoscopy (Suzuki et al, 2006). As the 
breath test is fast and potentially inexpensive, and results could in 
principle be obtained in real time, the test could be repeated in case 
of a positive result, and, hence, the number of false-positive (FP) test results could 
be reduced through test repetition. 
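
The effect of repeating a positive test can be illustrated with a short calculation. The sketch below is not part of the study: it assumes, purely for illustration, that repeated tests on the same subject err independently (which may not hold for real breath samples), and it uses the 89% sensitivity and 90% specificity reported for the GC vs benign model. The function name is hypothetical.

```python
def confirm_on_repeat(sensitivity: float, specificity: float, repeats: int = 2):
    """Rates when a subject counts as positive only if all `repeats` tests are positive.

    Assumes independent errors between repeats (an illustrative assumption only).
    """
    false_positive_rate = (1.0 - specificity) ** repeats
    combined_sensitivity = sensitivity ** repeats
    return combined_sensitivity, 1.0 - false_positive_rate


sens, spec = confirm_on_repeat(0.89, 0.90, repeats=2)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under these assumptions, the FP rate would fall from 10% to 1% after one repetition, while the sensitivity would drop from 89% to about 79%, so the gain in specificity comes at a cost that would have to be weighed in practice.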

Although this small-scale pilot study does not allow drawing 
far-reaching conclusions, the encouraging preliminary results 
presented here have initiated a large multicentre clinical trial to 
confirm the observed breath prints. The large-scale study, which is 
currently underway, also includes very early-stage GC and 
precancerous conditions. 

Note that the validation methods used in this pilot study 
((i) leave-one-out cross-validation and (ii) randomly blinding 25% 
of the data after building the DFA models with the entire data) did 
not accommodate an independent sample set that would be 
necessary for blind validation. The limited sample size of this 
pilot study did not allow reducing the training set for building the 
DFA models. The recently initiated large-scale multicentre 
trial addresses this point and accommodates a test set comprising 
25% of the collected samples, which are being blinded 
before the analysis and which will be disclosed only after the 
classification of the blind samples by the developed models. This 
blind validation is designed to validate the results that were 
presented here. 
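
The principle of leave-one-out cross-validation can be sketched as follows. This is a minimal illustration, not the analysis code of the study: a simple nearest-centroid classifier is swapped in for DFA, the function names are hypothetical, and the toy data are made up. The essential point is that the model is refitted without the held-out sample before that sample is classified.

```python
from statistics import fmean


def fit_centroids(samples, labels):
    """Return the mean feature vector of each class."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, sample):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def leave_one_out_accuracy(samples, labels):
    """Refit on all samples but one, classify the held-out one, repeat for each sample."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_x, train_y)
        hits += predict(model, samples[i]) == labels[i]
    return hits / len(samples)


# Made-up two-feature toy data, not study data.
toy_x = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.9, 1.0], [1.1, 0.8], [1.0, 1.1]]
toy_y = ["benign", "benign", "benign", "GC", "GC", "GC"]
print(leave_one_out_accuracy(toy_x, toy_y))
```

In contrast, blinding 25% of the samples after the model has been built with the entire data means that the "blind" samples have already influenced the model, which is why an independent test set is needed for true blind validation.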

It should furthermore be noted that previous medication can 
strongly affect the chemical composition of the exhaled breath. We 
have therefore excluded from this pilot study patients who took 
medication affecting gastric acid secretion (e.g. proton pump 
inhibitors) and/or antibiotics during an interval of 1 month before 
the breath test. However, a future breath test for GC should be 
stable against the typical medication that might be consumed by 
patients with gastric complaints, to become interesting for clinical 
use. The possible confounding effects of previous medication are 
currently being investigated in our recently initiated multicentre 
clinical trial, with the aim of achieving stability against the most 
important medications. 

The fundamental differences of the breath print characterisation 
with the nanomaterial-based sensor array and the chemical 
analysis of the breath samples by GC-MS. It is important to note 
that the breath print characterisation with the nanomaterial-based 
sensors is fundamentally different from the chemical analysis of the 
breath samples by GC-MS. The two methods should be considered 
as completely independent, complementary approaches. The 
patterns derived from the nanomaterial-based sensors usually 
show a better discriminative ability than the chemical analysis by 
GC-MS (Peng et al, 2010; Hakim et al, 2011). This can be 
understood in terms of the fundamental differences between the 
two methods. The sensors used in this study were broadly 
cross-reactive, that is, all of the sensors are expected to respond to a 
wide variety of breath VOCs, with much overlap in the sensitivities 
to specific VOCs. While the responses to the same compound at a 
certain concentration are individually different between the 
constituent sensors, because of the chemical diversity of the 
organic sorbent phase, the signals to the constituent VOCs that are 
present in the breath sample are in good approximation additive 
(Konvalina and Haick, 2012). Hence, the overall signal of one 
sensor can be expected to stem from a total VOC amount in the ~p.p.m.v. 
range. Among the VOCs that contribute to the sensors' signals 
could very well be compounds that cannot be detected or 
quantified by GC-MS, because their individual concentrations lie 
below the LoD or LoQ of our GC-MS equipment. It is reasonable 
to assume that the sensors' responses are less affected by noise than 
the detected p.p.b.v. concentrations of the separate compounds in 
the GC-MS analysis. On the other hand, the nanomaterial-based 
sensors are typically more sensitive to certain classes of VOCs, and 
less sensitive to other classes, because of the nature of the organic 
materials of the chemiresistive layers that adsorb the VOCs from 
the breath samples (see section S1.4 in the SOM). Hence, the signal 
from the nanomaterial-based sensors in the study might not stem 
from the same VOCs that were detected by GC-MS. For example, 
none of the DFA models in this study was sensitive to the smoking 
habits of the study population, even though the GC-MS analysis 
identified 2-propenenitrile, a known smoking marker, as distin- 
guishing VOC between GC patients and subjects having less severe 
gastric conditions. 
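
The additive approximation mentioned above can be illustrated with a toy calculation. The sketch below is not the sensor model of the study: the function name, the compound labels, the sensitivities and the concentrations are all hypothetical, and a strictly linear response is assumed for simplicity. It shows how several compounds, each at a low individual concentration, can together produce a sizeable signal on a single cross-reactive sensor.

```python
def sensor_response(sensitivities, concentrations_ppb):
    """Additive approximation: total response is the sum of per-VOC contributions.

    `sensitivities` maps each VOC to a response per p.p.b.v. (arbitrary units);
    `concentrations_ppb` maps each VOC to its concentration in the sample.
    """
    return sum(sensitivities[voc] * conc for voc, conc in concentrations_ppb.items())


# Hypothetical values for one sensor and three unnamed compounds.
toy_sensitivities = {"voc_a": 0.02, "voc_b": 0.01, "voc_c": 0.005}
toy_breath_sample = {"voc_a": 5.0, "voc_b": 20.0, "voc_c": 100.0}

total = sensor_response(toy_sensitivities, toy_breath_sample)
print(f"total response: {total:.2f} (arbitrary units)")
```

Because each sensor in the array has a different set of sensitivities, the same breath sample yields a different total on each sensor, and it is this pattern across sensors, rather than any single compound, that the DFA models operate on.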



Possible future relevance for clinical practice. Upper digestive 
endoscopy with biopsy is currently the standard for diagnosing GC 
and distinguishing it from benign gastric conditions that may 
present similar clinical symptoms. A major drawback is the limited 
patient compliance with this highly accurate but invasive and 
costly procedure (Chen et al, 2009). The survival from GC is poor; 
in Europe, the 5-year survival is below 25%, and the situation is 
even worse in the United States (Verdecchia et al, 2007). Also, 
some early GCs provide very little optical contrast with the 
surrounding mucosa, so that they could be missed during a routine 
endoscopic examination. Tumour markers could in principle be 
used to complement (unambiguous) endoscopic findings, but the 
yield of the traditional tumour markers (CEA, CA 19-9, CA 242 
and CA 72-4) for the detection of GC is low (Carpelan-Holmstrom 
et al, 2002). 

A future nanomaterial-based breath test for the simultaneous 
detection of malignant and benign gastric conditions in 
patients with unspecific gastric complaints would be suited to 
precede and complement upper digestive endoscopy with 
biopsy. Breath testing is fast, simple and non-invasive. Hence, 
the test would be highly acceptable to patients and would therefore 
be highly suited for identifying at-risk individuals that should 
undergo further endoscopic investigations, while avoiding unne- 
cessary invasive procedures. In this setting, breath testing could 
indicate malignancy before the endoscopic examination, thus 
allowing a well-directed, systematic search for malignant lesions, 
including hidden and small lesions that could otherwise be missed 
during endoscopy/biopsy. The results of the breath test could 
potentially provide valuable complementary information for 
distinguishing malignant and benign ulceration with identical 
morphology. 

Conclusion. We have presented initial data demonstrating that 
VOC-based breath prints detected by nanomaterial-based sensors 
could be used for identification of GC and distinction from benign 
stomach ulcers and less severe stomach conditions, irrespective of 
important confounding factors such as tobacco/alcohol consump- 
tion and H. pylori infection. Chemical analysis of the breath 
samples showed that five VOCs (2-propenenitrile, 2-butoxy-ethanol, 
furfural, 6-methyl-5-hepten-2-one and isoprene) were 
significantly elevated in patients with GC and/or peptic ulcer, as 
compared with less severe gastric conditions. The concentrations 
both in the ambient (hospital) air and in the breath samples were 
in the single p.p.b.v. range, except in the case of isoprene. Therefore, 
it cannot be excluded that 2-propenenitrile, 2-butoxy-ethanol, 
furfural and 6-methyl-5-hepten-2-one stem from acute or chronic 
accumulation in the body because of exposure to the hospital 
atmosphere. It should be noted that the applied methods were 
complementary and the potential marker compounds identified by 
GC-MS were not necessarily responsible for the differences in the 
sensor responses. A GC breath test could be developed in the 
future that could be used to precede and complement conventional 
upper digestive endoscopy with biopsy as a low-cost, large-scale 
screening tool for identifying individuals who should be referred 
for the endoscopic examination. However, this small-scale pilot 
study does not allow drawing far-reaching conclusions. The 
encouraging preliminary results presented here have initiated a 
multicentre clinical trial with considerably increased sample size to 
confirm the observed breath prints. This study is currently 
underway and will be published elsewhere. 